TIL contexture predicts checkpoint blockade response in melanoma patients
In this case study, we will use ProjecTILs to interpret human scRNA-seq T cell data in the context of a murine TIL atlas. We are going to use the single-cell dataset generated by Sade-Feldman GSE120575 to illustrate interspecies mapping of human cells on a stable murine atlas using gene orthologs.
Background
In the study by Sade-Feldman et al. (2018) Cell, the authours characterize transcriptional profiles of immune cells from melanoma patients before and after immune checkpoint blockade, with the goal to identify factors that associate with success or failure of checkpoint therapy. They reported that certain CD8+ T cell states associate with clinical outcome, and in particular that the transcription factor TCF7 is a significant marker for response to checkpoint therapy. This suggests that the state of T cells found within a tumor is critical for induction of effective immune tumor rejection, highlighting the importance of characterizing (and potentially intervening on) the diversity of tumor-infiltrating T cells.
The single-cell expression data GSE120575 consists of CD45+ single cells from 48 tumor biopsies of melanoma patients before or after checkpoint inhibitors treatment, and sequenced using the Smart-seq2 protocol. Meta-data on response to therapy (Responder vs. Non-responder) is also available on the same GEO identifier.
Here we use the development version of ProjecTILs >0.4.1, which implements ortholog gene conversion between human and mouse.
R Environment
Check & load R packages
Sys.setenv(R_REMOTES_NO_ERRORS_FROM_WARNINGS="true")
if (!requireNamespace("renv"))
install.packages("renv")
library(renv)
renv::restore()
#Load development version of ProjecTILs
remotes::install_git("https://gitlab.unil.ch/carmona/ProjecTILs.git", ref = "v0.4.1")
library(ProjecTILs)
library(gridExtra)scRNA-seq data preparation
Download the count matrix and metadata from Gene Expression Omnibus (GEO), and store as Seurat object.
cached.object <- "SadeFeldman.seurat.rds"
if(!file.exists(cached.object)){
library(GEOquery)
geo_acc <- "GSE120575"
datadir <- "input/SadeFeldman"
gse <- getGEO(geo_acc)
series <- paste0(geo_acc, "_series_matrix.txt.gz")
system(paste0("mkdir -p ", datadir))
getGEOSuppFiles(geo_acc,baseDir=datadir)
##Load expression matrix and metadata
exp.mat <- read.delim(sprintf("%s/%s/GSE120575_Sade_Feldman_melanoma_single_cells_TPM_GEO.txt.gz", datadir, geo_acc), header=F, sep="\t")
genes <- exp.mat[c(-1,-2),1]
cells <- as.vector(t(exp.mat[1,2:16292]))
samples <- as.factor(t(exp.mat[2,2:16292]))
exp.mat <- exp.mat[c(-1,-2), 2:16292]
colnames(exp.mat) <- cells
rownames(exp.mat) <- genes
meta <- read.delim(sprintf("%s/%s/GSE120575_patient_ID_single_cells.txt.gz", datadir, geo_acc), header = T, sep = "\t", skip = 19, nrows = 16291)
meta <- meta[,1:7]
treat <- factor(ifelse(grepl("Post", samples),'Post','Pre'))
response <- factor(meta$characteristics..response)
therapy <- factor(meta$characteristics..therapy)
##Create Seurat object and add meta data
query.object <- CreateSeuratObject(counts = exp.mat, project = "SadeFeldman", min.cells = 10, min.features = 50)
rm(exp.mat)
query.object@meta.data$Sample <- samples
query.object@meta.data$Time <- treat
query.object@meta.data$Response <- response
query.object@meta.data$Therapy <- therapy
saveRDS(query.object, file=cached.object)
} else {
query.object <- readRDS(cached.object)
}Some basic statistics - cells per group (Pre vs. Post, Therapy, Responder vs. Non-responder)
Post Pre
10363 5928
anti-CTLA4 anti-CTLA4+PD1 anti-PD1
517 4121 11653
Non-responder Responder
10727 5564
Post_P1 Post_P1_2 Post_P10
291 353 308
Post_P10_T_enriched Post_P11 Post_P12
88 313 323
Post_P13 Post_P13_T_enriched Post_P14
265 184 329
Post_P15 Post_P16 Post_P16_T_enriched
304 369 94
Post_P17 Post_P17_myeloid_enriched Post_P18
238 43 357
Post_P19 Post_P19_myeloid_enriched Post_P2
328 75 301
Post_P2_T_enriched Post_P20 Post_P21
79 358 213
Post_P22 Post_P23 Post_P23_2
260 358 362
Post_P28 Post_P28_2 Post_P3
369 372 359
Post_P3_2 Post_P30 Post_P4
358 369 421
Post_P4_T_enriched Post_P5 Post_P5_2
254 292 336
Post_P6 Post_P6_T_enriched Post_P7
308 92 231
Post_P8 Post_P8_T_enriched Pre_P1
327 82 229
Pre_P12 Pre_P15 Pre_P2
330 304 337
Pre_P20 Pre_P24 Pre_P25
323 338 371
Pre_P26 Pre_P27 Pre_P28
333 343 337
Pre_P29 Pre_P3 Pre_P31
163 245 351
Pre_P33 Pre_P35 Pre_P4
452 238 311
Pre_P6 Pre_P7 Pre_P8
288 368 267
Select only baseline (Pre-treatment) samples
Post_P1 Post_P1_2 Post_P10
0 0 0
Post_P10_T_enriched Post_P11 Post_P12
0 0 0
Post_P13 Post_P13_T_enriched Post_P14
0 0 0
Post_P15 Post_P16 Post_P16_T_enriched
0 0 0
Post_P17 Post_P17_myeloid_enriched Post_P18
0 0 0
Post_P19 Post_P19_myeloid_enriched Post_P2
0 0 0
Post_P2_T_enriched Post_P20 Post_P21
0 0 0
Post_P22 Post_P23 Post_P23_2
0 0 0
Post_P28 Post_P28_2 Post_P3
0 0 0
Post_P3_2 Post_P30 Post_P4
0 0 0
Post_P4_T_enriched Post_P5 Post_P5_2
0 0 0
Post_P6 Post_P6_T_enriched Post_P7
0 0 0
Post_P8 Post_P8_T_enriched Pre_P1
0 0 229
Pre_P12 Pre_P15 Pre_P2
330 304 337
Pre_P20 Pre_P24 Pre_P25
323 338 371
Pre_P26 Pre_P27 Pre_P28
333 343 337
Pre_P29 Pre_P3 Pre_P31
163 245 351
Pre_P33 Pre_P35 Pre_P4
452 238 311
Pre_P6 Pre_P7 Pre_P8
288 368 267
ProjecTILs
Load reference TIL atlas - if it’s not present in the working directory, it will be downloaded from the repository
[1] "Loading Default Reference Atlas..."
[1] "/Users/admin/gitHub/ProjecTILs_CaseStudies/ref_TILAtlas_mouse_v1.rds"
[1] "Loaded Reference map ref_TILAtlas_mouse_v1"
Run ProjecTILs projection algorithm - version 0.4.1 supports human ortholog conversion (human.ortho=T)
[1] "Using assay RNA for query"
[1] "1685 out of 5928 ( 28 % ) non-pure T cells removed. Use filter.cells=FALSE to avoid pre-filtering (NOT RECOMMENDED)"
[1] "Transforming expression matrix into space of mouse orthologs"
[1] "Aligning query to reference map for batch-correction..."
Projecting corrected query onto Reference PCA space
[1] "Projecting corrected query onto Reference UMAP space"
Plot global projection of human TIL data over the reference in UMAP space (non-T cell transcriptomes have been automatically removed)
Predict the cell states in the query set
query.projected <- cellstate.predict(ref=ref, query=query.projected)
table(query.projected$functional.cluster)
CD4_NaiveLike CD8_EarlyActiv CD8_EffectorMemory CD8_NaiveLike
82 413 789 1071
CD8_Tex CD8_Tpex Tfh Th1
583 188 170 612
Treg
335
Interestingly, expression of marker genes of projected human TILs correspond fairly well to those of the murine reference atlas: e.g. Pdcd1, Havcr2 and Entpd1 expression in terminally exhausted CD8 TILs (CD8_Tex), co-expression of Pdcd1, Tox and Tcf7 (at a modest level) in the CD8_Tpex state with a higher expression of Ifng; high expression of Tcf7, Ccr7 with low expression of Pdcd1 or cytotoxic molecules in the Naive-like states (that might include central memory cells); Cxcr5 and Tox coexpression in follicular helper T cells; Foxp3, Tox, Havcr2, Entpd1 in regulatory CD4 T cells, etc.
query.list <- SplitObject(query.projected, split.by = "Response")
plot.states.radar(ref, query=query.list,min.cells = 50,
genes4radar = c("Foxp3","Cd4","Cd8a","Tcf7","Ccr7","Gzmb","Pdcd1","Havcr2","Tox","Entpd1","Cxcr5","Ifng"))Note that projections and comparisons are performed in the ortholog space of murine genes - to check the names of human-mouse orthologs you can examine the conversion table for genes of interest:
data(Hs2Mm.convert.table)
which.genes <- c("TCF7","GZMB","CD8B","PDCD1")
Hs2Mm.convert.table[Hs2Mm.convert.table$Gene.HS %in% which.genes, ] Gene.stable.ID.HS Gene.HS Gene.MM Alt.symbol
7381 ENSG00000254126 CD8B Cd8b1 Cd8b1
7835 ENSG00000188389 PDCD1 Pdcd1 Pdcd1
11096 ENSG00000081059 TCF7 Tcf7 Tcf7
16688 ENSG00000100453 GZMB Gzmb Gzmd,Gzme,Gzmf,Gzmc,Gzmn,Gzmg,Gzmb
Alt.symbol.HS
7381 CD8B2,CD8B
7835 PDCD1
11096 TCF7
16688 GZMB
Response to therapy
Now to the interesting part. Let’s visualize the projection and TIL contexture of tumors that responded or not responded following immune checkpoint blockade:
query.list <- SplitObject(query.projected, split.by = "Response")
pll <- list()
pll[[1]] <- plot.projection(ref, query.list[["Responder"]]) + ggtitle("Responder")
pll[[2]] <- plot.statepred.composition(ref, query.list[["Responder"]], metric="Percent") + ggtitle("Responder") + ylim(0,40)
pll[[3]] <- plot.projection(ref, query.list[["Non-responder"]]) + ggtitle("Non-responder")
pll[[4]] <- plot.statepred.composition(ref, query.list[["Non-responder"]], metric="Percent") + ggtitle("Non-responder") + ylim(0,40)
grid.arrange(grobs=pll, ncol=2, nrow=2, widths=c(1.5,1))In non-responders, there is a clear enrichment in the terminally exhausted CD8 state (CD8_Tex), while responders are enriched in Naive-like states.
To better examine differences in TIL contexture between responders and non-responders, we can visualize the fold-change of T cell state frequency between the two groups:
which.types <- table(query.projected$functional.cluster)>20
stateColors_func <- c("#edbe2a","#A58AFF","#53B400","#F8766D","#00B6EB","#d1cfcc","#FF0000","#87f6a5","#e812dd")
states_all <- levels(factor(ref$functional.cluster))
names(stateColors_func) <- states_all
cols_use <- stateColors_func[names(which.types)][which.types]
#Responder vs non Responder
query.list <- SplitObject(query.projected, split.by = "Response")
norm.c <- table(query.list[["Non-responder"]]$functional.cluster)/sum(table(query.list[["Non-responder"]]$functional.cluster))
norm.q <- table(query.list[["Responder"]]$functional.cluster)/sum(table(query.list[["Responder"]]$functional.cluster))
foldchange <- norm.q[which.types]/norm.c[which.types]
foldchange <- sort(foldchange,decreasing = T)
tb.m <- melt(foldchange)
colnames(tb.m) <- c("Cell_state","Fold_change")
pll <- list()
ggplot(tb.m, aes(x=Cell_state, y=Fold_change, fill=Cell_state)) + geom_bar(stat="identity") +
scale_fill_manual(values=cols_use) + geom_hline(yintercept = 1) + scale_y_continuous(trans='log2') +
theme(axis.text.x=element_blank(), legend.position="left") + ggtitle("Responder vs. Non-responder") CD4 and CD8 Naive-like states are the most enriched in responders compared to non-responders, while terminally exhausted Entpd1+ Havcr2+ CD8 TILs are the most under-represented, confirming the observation of the original paper that a higher frequency of TCF7+ CD8 TILs is associated with response to immune checkpoint therapy.
Conclusions
Taking advantage of the ortholog mapping functionality of ProjecTILs, we have illustrated how to effortlessly analyze human scRNA-seq data in the context of a reference murine TIL atlas. Gene expression profiles confirmed that T cells are accurately projected in major CD4+ and CD8+ categories, as well as in more specific subtypes (CD8_Tex, CD8_Tpex, naive-like, Follicular helper, Th1, and regulatory CD4 cells). Comparing the baseline transcriptomics profile of TILs from melanoma patients responding and not responding to checkpoint inhibitors, and their composition in terms of cell states, confirms the observation of the original study that TCF7 correlates with checkpoint therapy responsivenss.
However, the enriched TCF7+ CD8 TIL population corresponds to a naive-like/central memory-like state that does not resembles the (Pdcd1+ Tox+) Precursor exhausted state well described in murine cancer and chronic infections. We also observed that non-responding biopsies are enriched in regulatory CD4 T cells and deplteted from helper T cells, which might also contribute to the reduced anti-tumor response following immunotherapy. ProjecTILs analysis in the context of a stable atlas gave a more complete picture of the T cell states that are associated with immunotherapy response.
Further reading
Original publication - Sade-Feldman et al. (2018) Cell
ProjecTILs case studies - INDEX - Repository